Word | Frequency | Number of right neighbors | Number of left neighbors | Ratio |
---|---|---|---|---|
s | 53488 | 2097 | 1 | 2097.0000 |
t | 12138 | 418 | 1 | 418.0000 |
But | 29431 | 374 | 2 | 187.0000 |
Another | 2149 | 152 | 1 | 152.0000 |
re | 2333 | 127 | 1 | 127.0000 |
Both | 2891 | 103 | 1 | 103.0000 |
After | 7676 | 261 | 3 | 87.0000 |
By | 2775 | 165 | 2 | 82.5000 |
And | 13420 | 153 | 2 | 76.5000 |
ll | 1178 | 76 | 1 | 76.0000 |
Each | 1350 | 74 | 1 | 74.0000 |
A | 29083 | 1378 | 19 | 72.5263 |
ve | 1663 | 67 | 1 | 67.0000 |
Also | 2269 | 66 | 1 | 66.0000 |
An | 3966 | 306 | 5 | 61.2000 |
Despite | 1830 | 50 | 1 | 50.0000 |
Although | 2665 | 45 | 1 | 45.0000 |
All | 4417 | 130 | 3 | 43.3333 |
States | 1464 | 43 | 1 | 43.0000 |
During | 2447 | 41 | 1 | 41.0000 |

Word | Frequency | Number of right neighbors | Number of left neighbors | Ratio |
---|---|---|---|---|
U.S | 3338 | 1 | 52 | 0.0192 |
p.m | 3034 | 2 | 96 | 0.0208 |
it. | 1834 | 2 | 92 | 0.0217 |
don | 2266 | 1 | 43 | 0.0233 |
Corp | 822 | 2 | 79 | 0.0253 |
didn | 1728 | 1 | 38 | 0.0263 |
a.m | 1530 | 2 | 55 | 0.0364 |
doesn | 1225 | 1 | 26 | 0.0385 |
up. | 514 | 2 | 49 | 0.0408 |
isn | 834 | 1 | 24 | 0.0417 |
J | 315 | 1 | 22 | 0.0455 |
wouldn | 442 | 1 | 22 | 0.0455 |
aren | 453 | 1 | 21 | 0.0476 |
system. | 360 | 1 | 20 | 0.0500 |
O | 683 | 2 | 39 | 0.0513 |
responsible | 415 | 1 | 19 | 0.0526 |
able | 1570 | 1 | 19 | 0.0526 |
Gov | 242 | 1 | 19 | 0.0526 |
willing | 328 | 1 | 19 | 0.0526 |
wasn | 824 | 1 | 18 | 0.0556 |
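In this subsection, we compute, for each word, the ratio of the number of right neighbors to the number of left neighbors. Again, we look for words with extreme ratios: the first table above lists the 20 words with the highest ratios, the second the 20 words with the lowest.
Data for the first table (highest ratios):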
select word,w.freq,aa.cnt, bb.cnt,aa.cnt/bb.cnt as r from words w, (select w1_id,count(c.w2_id) as cnt from co_n c where w1_id>100 group by w1_id) aa, (select w2_id,count(c.w1_id) as cnt from co_n c where w2_id>100 group by w2_id) bb where w_id=aa.w1_id and aa.w1_id=bb.w2_id order by r desc limit 20;
Diagram data:
select aa.cnt, bb.cnt from (select w1_id,count(c.w2_id) as cnt from co_n c where w1_id>100 group by w1_id) aa, (select w2_id,count(c.w1_id) as cnt from co_n c where w2_id>100 group by w2_id) bb where aa.w1_id=bb.w2_id;
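The same computation can be sketched outside the database. The following is a minimal illustration, not the code used to produce the tables: it assumes that each row of `co_n` is a distinct pair `(w1_id, w2_id)` in which `w2_id` directly follows `w1_id`, and it mirrors the `> 100` id cutoff and the inner join of the queries above. The function name `neighbor_ratios` and the toy ids are hypothetical.

```python
from collections import defaultdict

def neighbor_ratios(pairs, min_id=100):
    """Return {word_id: (n_right, n_left, ratio)} for ids above min_id.

    pairs: iterable of (w1_id, w2_id), meaning w2 directly follows w1
    (assumed layout of the co_n table).
    """
    right = defaultdict(set)  # w1_id -> ids seen directly to its right
    left = defaultdict(set)   # w2_id -> ids seen directly to its left
    for w1, w2 in pairs:
        if w1 > min_id:       # mirrors "where w1_id>100"
            right[w1].add(w2)
        if w2 > min_id:       # mirrors "where w2_id>100"
            left[w2].add(w1)
    # Inner join: keep only words with at least one neighbor on each side,
    # so the denominator is never zero.
    result = {}
    for w in right.keys() & left.keys():
        n_right, n_left = len(right[w]), len(left[w])
        result[w] = (n_right, n_left, n_right / n_left)
    return result

# Toy data with made-up ids (not taken from the corpus).
pairs = [(101, 201), (101, 202), (101, 203), (300, 101),
         (201, 102), (202, 102), (102, 400)]
ratios = neighbor_ratios(pairs)

# Highest ratios first, as in "order by r desc".
top = sorted(ratios.items(), key=lambda kv: kv[1][2], reverse=True)
```

In this toy input, word 101 has three right neighbors and one left neighbor, so it sorts first; word 102 has one right and two left neighbors, so it sorts last. Words that lack a neighbor on one side (here 203 and 300) drop out, which is why no row in the tables has a neighbor count of zero.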